NACSIS Test Collection Workshop (NTCIR-1)
Abstract
The test collection used in the Workshop consists of more than 330,000 documents, more than half of which are English-Japanese paired documents. Although a Japanese test collection called BMIRJ2, consisting of 5,080 newspaper articles, already exists [2], Japanese test collections need to be enhanced in both the variety of text types and the scale. We put emphasis on cross-lingual retrieval because it is critical in the Internet environment and in Japanese scientific information retrieval [3].
Similar resources
Asian Language Parsing Evaluated by Hummingbird SearchServer™ at NTCIR-3
Hummingbird submitted ranked result sets for the Chinese, Japanese and Korean Single Language Information Retrieval tracks of the Cross-Language Retrieval Task of the 3rd NII-NACSIS Test Collection for IR Systems Workshop (NTCIR-3). SearchServer 5.3’s segmenter for Asian text, compared to an overlapping n-gram approach, was found to modestly increase precision scores for Japanese, to have a neu...
CJK Experiments with Hummingbird SearchServer™ at NTCIR-5
Hummingbird submitted ranked result sets for the Chinese, Japanese and Korean Single Language Information Retrieval subtasks of the Cross-Lingual Information Retrieval Task of the 5th NII-NACSIS Test Collection for IR Systems Workshop (NTCIR-5). For short Chinese (title) queries, a decompounded word-based approach produced higher (statistically significant) mean average precision and first relev...
NTCIR CLIR Experiments at the University of Maryland
This paper presents results for the Japanese/English cross-language information retrieval task on the NACSIS Test Collection. Two automatic dictionary-based query translation techniques were tried with four variants of the queries. The results indicate that longer queries outperform the required description-only queries and that use of the first translation in the dictionary is comparable with the ...
Evaluation -- the Way Ahead: A Case of the NTCIR
Noriko Kando, National Institute of Informatics (NII), Tokyo. Abstract: This paper introduces activities of the cross-lingual information retrieval (CLIR) systems evaluation in the NTCIR (NII-NACSIS Test Collection for Information Retrieval and Text Processing Technologies) project and suggests several axes as a framework describing the nature of CLIR experiments. Finally it menti...
The Very Large Collection and Web Tracks (Preprint version)
Together, the TREC Very Large Collection (VLC) Track and its successor the Web Track have run for seven years, after an initial VLC pre-track. During that time five new test collections have been created, five different types of retrieval task have been studied, a large number of important issues have been addressed, and new methods have been tried, not only for retrieval, but also for test col...